JSON.generate: call to_json on String subclasses #668

Merged (1 commit) on Oct 31, 2024

casperisfine

Fix: #667

This is yet another behavior on which the various implementations differed, but the C implementation used to call to_json on String subclasses used as keys.

This was optimized out in e125072, but there is an Active Support test case for it, so it's best to make all 3 implementations respect this behavior.

FYI: @mtasaka
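
To make the behavior concrete, here is a minimal sketch of what "calling to_json on String subclasses used as keys" can look like. It is not the actual code from the PR: `UpcasedKey`, `escape_plain` and `encode_key` are hypothetical names, and the escaping is deliberately simplified.

```ruby
require "json"

# Hypothetical String subclass that customizes its JSON representation.
class UpcasedKey < String
  def to_json(*)
    "\"" + upcase + "\""
  end
end

# Simplified escaping for plain Strings (quotes and backslashes only).
# Assumption: the real generators handle many more cases than this.
def escape_plain(str)
  escaped = str.gsub(/["\\]/) { |c| "\\" + c }
  "\"" + escaped + "\""
end

# Exact-class check: plain Strings take the fast path, while String
# subclasses get a chance to customize their output through to_json.
def encode_key(key)
  if key.class == String
    escape_plain(key)
  else
    key.to_json
  end
end

puts encode_key("name")                 # prints "name"
puts encode_key(UpcasedKey.new("name")) # prints "NAME"
```

The point of the exact `key.class == String` comparison is that a subclass never silently takes the plain-String path.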

byroot merged commit 96397cf into ruby:master Oct 31, 2024
36 checks passed

eregon commented Oct 31, 2024

This seems a bit unfortunate performance-wise.
I wonder why EscapedString inherits from String; if it didn't, it would just work without special care, no?
Or the escaping could be done eagerly directly in jsonify.


byroot commented Oct 31, 2024

I think the performance cost is very minor, just an extra pointer comparison on the class, that's tagged as unlikely, so probably predicted out.

I didn't see much change on the benchmarks.


eregon commented Oct 31, 2024

In lib/json/pure/generator.rb, generate_json now has quite a bit of extra code and checks, which seems likely to hurt dump perf quite a bit.

@casperisfine

My assumption is that the pure generator is only really used with Truffle or in contexts where perf is disregarded. Have you measured the perf degradation on Truffle? Is it really substantial? I'd expect Truffle to be able to optimize most of that out.


eregon commented Oct 31, 2024

I'll check it out later.


eregon commented Oct 31, 2024

Before this PR (6d3b3ac):

$ ruby benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
      JSON.dump(obj)   350.000 i/100ms
      JSON.dump(obj)   745.000 i/100ms
      JSON.dump(obj)   829.000 i/100ms
      JSON.dump(obj)   799.000 i/100ms
      JSON.dump(obj)   836.000 i/100ms
Calculating -------------------------------------
      JSON.dump(obj)      8.152k (± 9.8%) i/s  (122.66 μs/i) -     40.128k in   5.008035s
      JSON.dump(obj)      8.166k (± 9.3%) i/s  (122.46 μs/i) -     40.964k in   5.084684s
      JSON.dump(obj)      8.180k (± 6.6%) i/s  (122.26 μs/i) -     40.964k in   5.038216s
      JSON.dump(obj)      8.101k (± 9.2%) i/s  (123.43 μs/i) -     40.128k in   5.020143s
      JSON.dump(obj)      8.156k (± 7.0%) i/s  (122.60 μs/i) -     40.964k in   5.056036s

$ ruby benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
      JSON.dump(obj)   336.000 i/100ms
      JSON.dump(obj)   793.000 i/100ms
      JSON.dump(obj)   827.000 i/100ms
      JSON.dump(obj)   828.000 i/100ms
      JSON.dump(obj)   788.000 i/100ms
Calculating -------------------------------------
      JSON.dump(obj)      8.074k (± 8.6%) i/s  (123.85 μs/i) -     40.188k in   5.037323s
      JSON.dump(obj)      7.993k (± 9.8%) i/s  (125.11 μs/i) -     39.400k in   5.004141s
      JSON.dump(obj)      8.032k (± 9.8%) i/s  (124.51 μs/i) -     40.188k in   5.079140s
      JSON.dump(obj)      8.013k (± 8.8%) i/s  (124.79 μs/i) -     40.188k in   5.073014s
      JSON.dump(obj)      8.098k (± 9.2%) i/s  (123.49 μs/i) -     40.188k in   5.038510s

After this PR (96397cf):

$ ruby benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
      JSON.dump(obj)   363.000 i/100ms
      JSON.dump(obj)   557.000 i/100ms
      JSON.dump(obj)   647.000 i/100ms
      JSON.dump(obj)   605.000 i/100ms
      JSON.dump(obj)   580.000 i/100ms
Calculating -------------------------------------
      JSON.dump(obj)      6.405k (± 7.1%) i/s  (156.14 μs/i) -     31.900k in   5.016689s
      JSON.dump(obj)      6.387k (± 9.6%) i/s  (156.57 μs/i) -     31.900k in   5.077313s
      JSON.dump(obj)      6.437k (± 5.6%) i/s  (155.34 μs/i) -     32.480k in   5.067145s
      JSON.dump(obj)      6.394k (± 8.5%) i/s  (156.40 μs/i) -     31.900k in   5.045133s
      JSON.dump(obj)      6.402k (± 7.7%) i/s  (156.20 μs/i) -     31.900k in   5.022605s

$ ruby benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
      JSON.dump(obj)   311.000 i/100ms
      JSON.dump(obj)   557.000 i/100ms
      JSON.dump(obj)   617.000 i/100ms
      JSON.dump(obj)   579.000 i/100ms
      JSON.dump(obj)   589.000 i/100ms
Calculating -------------------------------------
      JSON.dump(obj)      6.444k (± 7.8%) i/s  (155.18 μs/i) -     32.395k in   5.069218s
      JSON.dump(obj)      6.436k (± 7.9%) i/s  (155.37 μs/i) -     32.395k in   5.073131s
      JSON.dump(obj)      6.521k (± 6.3%) i/s  (153.36 μs/i) -     32.984k in   5.083403s
      JSON.dump(obj)      6.534k (± 5.7%) i/s  (153.04 μs/i) -     32.984k in   5.071127s
      JSON.dump(obj)      6.483k (±10.8%) i/s  (154.24 μs/i) -     31.806k in   5.023239s


eregon commented Oct 31, 2024

I tried to undo the changes incrementally and the main slowdown seems to come from using:

klass = obj.class
if klass == Hash
elsif klass == Array
elsif klass == String
...

(I also tried reversing that == check, but it didn't seem to change perf much; the reasoning was that klass->klass is polymorphic, since each class has a different singleton class.)

vs

case obj
when Hash
when Array
when String
...

I wonder if maybe it's simply obj.class which is expensive, because it's called on objects of various classes and the method lookup needs to get the obj->klass and caches on that, but that will be a polymorphic inline cache.
OTOH Module#=== as in when Hash doesn't have that polymorphism, because it's called on a constant module (there are still some branches to handle primitives like int/boolean/double, but for all non-primitives it's just a field read).
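
Besides the performance difference reported above, the two dispatch styles are not equivalent for subclasses. The sketch below only illustrates that semantic difference and makes no performance claim; `TaggedString`, `classify_by_class` and `classify_by_case` are hypothetical names.

```ruby
# Hypothetical String subclass used only for illustration.
class TaggedString < String; end

# Exact class comparison: subclasses do NOT match.
def classify_by_class(obj)
  klass = obj.class
  if klass == Hash
    :hash
  elsif klass == Array
    :array
  elsif klass == String
    :string
  else
    :other
  end
end

# case/when uses Module#===, which also matches subclasses.
def classify_by_case(obj)
  case obj
  when Hash then :hash
  when Array then :array
  when String then :string
  else :other
  end
end

tagged = TaggedString.new("x")
p classify_by_class("x")     # prints :string
p classify_by_case("x")      # prints :string
p classify_by_class(tagged)  # prints :other
p classify_by_case(tagged)   # prints :string
```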


byroot commented Oct 31, 2024

And how fast is the C implementation on Truffle?

Because on my machine:

$ ruby --yjit -Ilib:ext benchmark/standalone.rb dump
JSON::Ext::Generator
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
      JSON.dump(obj)     3.295k i/100ms
Calculating -------------------------------------
      JSON.dump(obj)     33.306k (± 0.6%) i/s   (30.02 μs/i) -    168.045k in   5.045675s

So maybe Truffle should just go back to using the C extension?

eregon added a commit to eregon/json that referenced this pull request Nov 1, 2024
* if/elsif comparing `obj.class` is significantly slower:
  ruby#668 (comment)
* The only case where an exact class check is needed so far is for String (ruby#667).
* Before: $ ruby -Ilib:ext benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.2.0-dev-07b978e4, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Calculating -------------------------------------
      JSON.dump(obj)      6.426k (± 5.9%) i/s  (155.62 μs/i) -     32.395k in   5.064479s
      JSON.dump(obj)      6.380k (± 7.4%) i/s  (156.73 μs/i) -     31.806k in   5.021304s
      JSON.dump(obj)      6.276k (±10.5%) i/s  (159.33 μs/i) -     31.217k in   5.060762s
      JSON.dump(obj)      6.450k (± 7.0%) i/s  (155.05 μs/i) -     32.395k in   5.059538s
      JSON.dump(obj)      6.413k (± 6.2%) i/s  (155.93 μs/i) -     32.395k in   5.081573s
* After: $ ruby -Ilib:ext benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.2.0-dev-07b978e4, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Calculating -------------------------------------
      JSON.dump(obj)      8.237k (± 5.0%) i/s  (121.41 μs/i) -     41.600k in   5.069507s
      JSON.dump(obj)      8.179k (± 5.1%) i/s  (122.26 μs/i) -     40.768k in   5.002035s
      JSON.dump(obj)      8.147k (± 7.9%) i/s  (122.74 μs/i) -     40.768k in   5.044840s
      JSON.dump(obj)      8.137k (± 6.9%) i/s  (122.90 μs/i) -     40.768k in   5.048690s
      JSON.dump(obj)      8.112k (±10.2%) i/s  (123.27 μs/i) -     39.936k in   5.023502s
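
The approach described in this commit message (dispatch with `case`/`when`, and an exact class check only where String needs it) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `Loud` and `sketch_generate` are hypothetical names, string escaping is simplified, and hash keys are simply converted with `to_s`.

```ruby
require "json"

# Hypothetical String subclass with a custom JSON representation.
class Loud < String
  def to_json(*)
    "\"" + upcase + "\""
  end
end

def sketch_generate(obj)
  case obj # Module#=== dispatch, no obj.class call on the common path
  when Hash
    pairs = obj.map { |k, v| sketch_generate(k.to_s) + ":" + sketch_generate(v) }
    "{" + pairs.join(",") + "}"
  when Array
    "[" + obj.map { |v| sketch_generate(v) }.join(",") + "]"
  when String
    if obj.class == String
      # Plain String: simplified escaping (quotes and backslashes only).
      "\"" + obj.gsub(/["\\]/) { |c| "\\" + c } + "\""
    else
      # String subclass: let it customize its output.
      obj.to_json
    end
  else
    obj.to_json
  end
end

puts sketch_generate({ "a" => ["x", Loud.new("y"), 1] })
# prints {"a":["x","Y",1]}
```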

eregon commented Nov 1, 2024

Unfortunately the C extension is still much slower, needs some investigation & profiling:
(using master...eregon:json:truffleruby-use-generator-cext)

$ ruby --experimental-options --cexts-panama -Ilib:ext benchmark/standalone.rb dump ext
JSON::Ext::Generator
truffleruby 24.2.0-dev-07b978e4, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Warming up --------------------------------------
      JSON.dump(obj)    46.000 i/100ms
      JSON.dump(obj)    55.000 i/100ms
      JSON.dump(obj)    56.000 i/100ms
      JSON.dump(obj)    56.000 i/100ms
      JSON.dump(obj)    55.000 i/100ms
Calculating -------------------------------------
      JSON.dump(obj)    562.849 (± 1.1%) i/s    (1.78 ms/i) -      2.860k in   5.081942s
      JSON.dump(obj)    562.589 (± 0.5%) i/s    (1.78 ms/i) -      2.860k in   5.083810s
      JSON.dump(obj)    562.587 (± 0.7%) i/s    (1.78 ms/i) -      2.860k in   5.083864s
      JSON.dump(obj)    563.702 (± 0.5%) i/s    (1.77 ms/i) -      2.860k in   5.073772s
      JSON.dump(obj)    561.796 (± 1.4%) i/s    (1.78 ms/i) -      2.860k in   5.091997s

Compared to after #674:

$ ruby -Ilib:ext benchmark/standalone.rb dump pure
    JSON::Pure::Generator
    truffleruby 24.2.0-dev-07b978e4, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
    Calculating -------------------------------------
          JSON.dump(obj)      8.237k (± 5.0%) i/s  (121.41 μs/i) -     41.600k in   5.069507s
          JSON.dump(obj)      8.179k (± 5.1%) i/s  (122.26 μs/i) -     40.768k in   5.002035s
          JSON.dump(obj)      8.147k (± 7.9%) i/s  (122.74 μs/i) -     40.768k in   5.044840s
          JSON.dump(obj)      8.137k (± 6.9%) i/s  (122.90 μs/i) -     40.768k in   5.048690s
          JSON.dump(obj)      8.112k (±10.2%) i/s  (123.27 μs/i) -     39.936k in   5.023502s

To have a baseline:

$ ruby --yjit -Ilib:ext benchmark/standalone.rb dump
JSON::Ext::Generator
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) +YJIT [x86_64-linux]
Warming up --------------------------------------
      JSON.dump(obj)     1.918k i/100ms
Calculating -------------------------------------
      JSON.dump(obj)     19.110k (± 1.1%) i/s   (52.33 μs/i) -     95.900k in   5.018895s

casperisfine pushed a commit to casperisfine/json that referenced this pull request Nov 4, 2024
Ref: ruby#674
Ref: ruby#668

The behavior in such cases is quite unclear; the goal here is to
figure out what the behavior was in the Cext version of `json 2.7.0`
and get all implementations to converge.

We can then decide to make them all behave differently if we so wish.
eregon added a commit to eregon/json that referenced this pull request Nov 4, 2024
* if/elsif comparing `obj.class` is significantly slower:
  ruby#668 (comment)
* The only case where an exact class check is needed so far is for String (ruby#667).
* Before: $ ruby -Ilib:ext benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.2.0-dev-07b978e4, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Calculating -------------------------------------
      JSON.dump(obj)      6.426k (± 5.9%) i/s  (155.62 μs/i) -     32.395k in   5.064479s
      JSON.dump(obj)      6.380k (± 7.4%) i/s  (156.73 μs/i) -     31.806k in   5.021304s
      JSON.dump(obj)      6.276k (±10.5%) i/s  (159.33 μs/i) -     31.217k in   5.060762s
      JSON.dump(obj)      6.450k (± 7.0%) i/s  (155.05 μs/i) -     32.395k in   5.059538s
      JSON.dump(obj)      6.413k (± 6.2%) i/s  (155.93 μs/i) -     32.395k in   5.081573s
* After: $ ruby -Ilib:ext benchmark/standalone.rb dump pure
JSON::Pure::Generator
truffleruby 24.2.0-dev-07b978e4, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
Calculating -------------------------------------
      JSON.dump(obj)      8.237k (± 5.0%) i/s  (121.41 μs/i) -     41.600k in   5.069507s
      JSON.dump(obj)      8.179k (± 5.1%) i/s  (122.26 μs/i) -     40.768k in   5.002035s
      JSON.dump(obj)      8.147k (± 7.9%) i/s  (122.74 μs/i) -     40.768k in   5.044840s
      JSON.dump(obj)      8.137k (± 6.9%) i/s  (122.90 μs/i) -     40.768k in   5.048690s
      JSON.dump(obj)      8.112k (±10.2%) i/s  (123.27 μs/i) -     39.936k in   5.023502s
casperisfine pushed a commit to Shopify/ruby that referenced this pull request Nov 5, 2024
…es subclasses

Ref: ruby/json#674
Ref: ruby/json#668

The behavior in such cases is quite unclear; the goal here is to
figure out what the behavior was in the Cext version of `json 2.7.0`
and get all implementations to converge.

We can then decide to make them all behave differently if we so wish.

ruby/json@614921dcef
byroot added a commit to ruby/ruby that referenced this pull request Nov 5, 2024
…es subclasses

Ref: ruby/json#674
Ref: ruby/json#668

The behavior in such cases is quite unclear; the goal here is to
figure out what the behavior was in the Cext version of `json 2.7.0`
and get all implementations to converge.

We can then decide to make them all behave differently if we so wish.

ruby/json@614921dcef
Successfully merging this pull request may close these issues.

active_support test began to fail beginning with json 2.7.3